Abstract- Maximization of mutual information is a very
powerful criterion for 3D medical image registration, allowing
robust and accurate fully automated rigid registration of
multimodal images in a variety of applications. In this paper, we
present a method based on normalized mutual information
with sub-sampling of the images for 3D registration of
CT, MR and PET images. Powell’s direction set method
and Brent’s one-dimensional optimization algorithm were used
as the optimization strategy. A multi-resolution approach was
applied to speed up the matching process. For PET images,
segmentation was performed as a pre-processing step to reduce
background artifacts. According to the evaluation by
Vanderbilt University, the average of the mean registration
errors was 1.47 mm for the CT-MR task and 3.22 mm for the
MR-PET task. The registered images with edge extraction showed
good matches on visual inspection. Sub-voxel accuracy in
multi-modality registration was achieved with this algorithm.

I.  Introduction 

Image registration is a prerequisite for many tasks in medical
imaging. Each imaging modality provides different and useful
information. Anatomical modalities, such as X-ray, CT (Computed
Tomography), MRI (Magnetic Resonance Imaging) and US
(Ultrasound), acquired by various imaging systems, primarily
depict morphology. Functional modalities, including the γ camera,
SPECT (Single Photon Emission Computed Tomography) and
PET (Positron Emission Tomography), primarily depict
metabolism but carry little anatomical information. Since
information obtained at different times with the same modality,
or with multiple modalities, is usually complementary, proper
registration is desired in clinical diagnosis, radiotherapy
planning and surgical evaluation.

There are two kinds of multi-modality image registration
methods: extrinsic methods and intrinsic methods [1, 2, 3, 4].
The former are usually used as the gold standard for registration
evaluation. The latter rely only on patient-generated image
content. Maximization of mutual information has been
recommended as a powerful intrinsic criterion for multimodal
medical image registration. The method applies the
concept of mutual information to measure the statistical
dependence between the image intensities of corresponding
voxels in both images, which is assumed to be maximal when the
images are geometrically aligned.

In this paper, we present a method based on normalized
mutual information with sub-sampling of the images for 3D
registration of CT, MR and PET images. Powell’s direction
set method and Brent’s one-dimensional optimization
algorithm were used as the optimization strategy. For PET
images, segmentation was performed as a pre-processing step to
reduce background artifacts. The accuracy was validated
to be sub-voxel by comparison with the stereotactic registration
solution, according to the evaluation by Vanderbilt University.

II.  Methodology 

A.  Image  Acquisition 

The raw image data of 18 patients were provided by the
Vanderbilt University project entitled “Evaluation of
Retrospective Image Registration”. Each data set from
patient_001 to patient_009 contained one CT volume and/or
one PET volume, and six MR volumes (PD, T1, T2, PD_rectified,
T1_rectified, T2_rectified), all of low resolution. Each data
set from patient_101 to patient_109 contained four high
resolution image volumes: one axial CT and three axial MR
spin-echo (PD, T1 and T2) images.

An overview of the resolution of the data sets is given in
Table I. The raw data were encoded as two-byte two's
complement integers in big-endian byte order. All data sets
were normalized to the range 0-255 for processing.
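
As an illustration of the encoding described above, the following minimal Python sketch decodes big-endian two-byte two's complement integers and rescales them to 0-255. The function names are hypothetical, and the linear min-max rescaling is an assumption: the text states only the target range, not the normalization rule that was used.

```python
import struct

def decode_big_endian_int16(raw):
    """Decode a byte string of two-byte two's complement big-endian integers."""
    count = len(raw) // 2
    return list(struct.unpack(">%dh" % count, raw[: 2 * count]))

def normalize_to_0_255(values):
    """Rescale intensities linearly to 0-255 (min-max scaling is an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    scale = 255.0 / (hi - lo)
    return [int(round((v - lo) * scale)) for v in values]

# Three sample voxels: -1024, 0 and 2000, the approximate CT data range.
raw = bytes([0xFC, 0x00, 0x00, 0x00, 0x07, 0xD0])
voxels = decode_big_endian_int16(raw)
scaled = normalize_to_0_255(voxels)
```

Here `voxels` holds the signed intensities and `scaled` their 0-255 counterparts; a real volume would be read from file in the same way, slice by slice.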

B.  Pre-processing  of  PET  Images 

Because the PET images were blurred, they were
pre-segmented to reduce the radiated artifacts [5].

C.  Sub-sampling  and  Mutual  Information  [6,7,8] 

One of the images was selected as the floating image F and
the other as the reference image R. A rigid body transformation
was applied to the multi-modal images of the head because it is
reasonable to assume that the skull is rigid. The
transformation was restricted to six degrees of freedom (three
translations and three rotations), thus:

$$V_R \cdot (P_R - C_R) = R_x(\phi_x) \cdot R_y(\phi_y) \cdot R_z(\phi_z) \cdot V_F \cdot (P_F - C_F) + t(t_x, t_y, t_z) \qquad (1)$$

where V was a 3×3 diagonal matrix representing the
respective voxel size, P was the voxel position in the respective
image, C was the image center, R was the rotation around each
of the three axes, φ was the rotation angle and t was the
translation vector.
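
The rigid-body relation above can be sketched in Python as follows. This is only an illustration: the function names, the right-handed rotation sign convention and the rotation order Rx·Ry·Rz are assumptions, and the voxel sizes in the usage lines are the low resolution CT and MR values from Table I.

```python
import math

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def floating_to_reference(p_f, c_f, v_f, c_r, v_r, angles, t):
    """Map voxel coordinates p_f in F to voxel coordinates in R.

    v_f and v_r are the voxel sizes in mm (the diagonals of V_F and V_R),
    c_f and c_r the image centers, angles = (phi_x, phi_y, phi_z) in radians
    and t the translation in mm.
    """
    rot = matmul(rot_x(angles[0]), matmul(rot_y(angles[1]), rot_z(angles[2])))
    mm_f = [v_f[i] * (p_f[i] - c_f[i]) for i in range(3)]       # V_F (P_F - C_F)
    mm_r = [x + t[i] for i, x in enumerate(matvec(rot, mm_f))]  # rotate, translate
    return [c_r[i] + mm_r[i] / v_r[i] for i in range(3)]        # back to voxels

# Identity transform: a CT voxel 100 columns right of the CT center lands
# 65 mm / 1.25 mm = 52 columns right of the MR center.
p_r = floating_to_reference([356.0, 256.0, 15.0], [256.0, 256.0, 15.0],
                            (0.65, 0.65, 4.0), [128.0, 128.0, 13.0],
                            (1.25, 1.25, 4.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0))
```

Working in millimetres between the two voxel grids is what lets images with different voxel sizes share one set of six parameters.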

The images were not pre-registered, other than having
their centers aligned and their axis orientations corresponding.
Samples were taken from F on a regular grid, at separate
sample intervals in the x, y and z directions, and
transformed by the geometric transformation into the
reference image R. The joint and marginal histograms of
the intensities f and r of corresponding voxels in the volume
of overlap of F and R were then constructed, and from these the
normalized mutual information registration criterion
ECC(F,R) was computed.

The sampling range was chosen to be within a
20 cm × 20 cm square around the image centers. The equidistant
sampling was controlled by sampling factors: setting a
sampling factor to a positive integer x resulted in only one
out of every x voxels along an image axis being used in the
computation. The sampling factors were defined separately
for each dimension, based on experience. The high
resolution images were sub-sampled with a factor of 3 in
plane and 1 out of plane. The low resolution images were
matched in a coarse-to-fine manner using two levels: they were
sampled with a factor of 3 in plane and 1 out of plane at the
first level, and with a factor of 1 in each dimension at the
second level.
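
The sampling factors can be illustrated with a short Python sketch. The names `sample_indices` and `LEVELS` are hypothetical, the toy volume is far smaller than the real data, and the restriction to the 20 cm square around the center is left out for brevity.

```python
def sample_indices(shape, factors):
    """Grid positions that keep one voxel out of every `factor` along each axis."""
    return [(x, y, z)
            for z in range(0, shape[2], factors[2])
            for y in range(0, shape[1], factors[1])
            for x in range(0, shape[0], factors[0])]

# Two-level coarse-to-fine schedule for the low resolution images,
# given as (in-plane x, in-plane y, out-of-plane z) sampling factors.
LEVELS = [(3, 3, 1), (1, 1, 1)]

shape = (12, 12, 4)  # toy volume size in voxels
counts = [len(sample_indices(shape, factors)) for factors in LEVELS]
```

For this toy volume the first level evaluates 64 samples and the second 576, which shows where the saving in computation time comes from: the coarse level does most of the search on about one ninth of the voxels.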

If the corresponding point of a sample after
transformation fell outside the reference image, its intensity
was set equal to that of its nearest neighbor on
the volume edge.
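
One simple way to realize this boundary rule, shown here as a hedged Python sketch with a hypothetical function name, is to clamp each coordinate of the transformed position to the valid index range before looking up the intensity.

```python
def clamp_to_volume(point, shape):
    """Move an out-of-volume position to its nearest point on the volume edge."""
    return tuple(min(max(p, 0), n - 1) for p, n in zip(point, shape))

# A sample that falls left of and above a 256 x 256 x 26 reference volume.
inside = clamp_to_volume((-2.5, 10, 30), (256, 256, 26))
```

Positions already inside the volume are returned unchanged, so the same call can be applied to every sample.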

In most cases, the transformed position of a voxel in F
did not coincide exactly with a voxel position in R, so
interpolation was required. We chose cubic spline
interpolation.

The normalized mutual information of two images A and B,
the entropy correlation coefficient (ECC), was
defined in terms of the entropies H(A) and H(B) of the
images and their joint entropy H(A,B) [9, 10] as
follows:

$$ECC(A,B) = \frac{2\,[H(A) + H(B) - H(A,B)]}{H(A) + H(B)} \qquad (2)$$

and

$$H(A) = -\sum_{k=1} p_k \log p_k \qquad (3)$$
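
Equations (2) and (3) can be sketched in Python as below. This is a minimal illustration rather than the implementation used in the paper: the function names are hypothetical, intensities are binned by exact value, and the natural logarithm is an arbitrary choice since the ECC is a ratio of entropies.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy H = -sum p_k log p_k of a histogram given as counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def ecc(f_values, r_values):
    """Entropy correlation coefficient of paired intensity samples.

    f_values[i] and r_values[i] are the intensities of corresponding voxels
    in the overlap of F and R. The result is undefined (division by zero)
    when both images are constant.
    """
    joint = Counter(zip(f_values, r_values))    # joint histogram
    h_f = entropy(Counter(f_values).values())   # marginal entropy H(F)
    h_r = entropy(Counter(r_values).values())   # marginal entropy H(R)
    h_fr = entropy(joint.values())              # joint entropy H(F,R)
    return 2.0 * (h_f + h_r - h_fr) / (h_f + h_r)

dependent = ecc([0, 0, 1, 1], [5, 5, 9, 9])    # r is a function of f
independent = ecc([0, 0, 1, 1], [5, 9, 5, 9])  # f says nothing about r
```

The two toy cases bracket the range of the criterion: it reaches 1 when one intensity fully determines the other and 0 when the two are statistically independent.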


D.  Optimization  Strategy 

The six parameters were optimized by Powell’s multidimensional
direction set method, combined with Brent’s
one-dimensional optimization algorithm, to maximize
ECC(F,R). The initial values of all parameters were set to
zero, and the initial directions were set to the unit vectors.
The best search order of the parameters was found to be

(ft  '  ft  ■  0-  '  0y  '  0V  A  v)  ' 
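
The idea of optimizing the parameters one direction at a time can be sketched in Python as follows. This is a deliberately simplified stand-in, not the method used in the paper: a golden-section search replaces Brent’s algorithm, the directions stay fixed at the unit vectors instead of being updated as in Powell’s method, and all names, the search span and the toy objective are assumptions.

```python
import math

GOLD = (math.sqrt(5.0) - 1.0) / 2.0  # golden ratio conjugate, about 0.618

def line_maximize(func, lo, hi, tol=1e-6):
    """Golden-section search for the maximum of a unimodal function on [lo, hi]."""
    a, b = lo, hi
    c, d = b - GOLD * (b - a), a + GOLD * (b - a)
    fc, fd = func(c), func(d)
    while b - a > tol:
        if fc > fd:                # the maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - GOLD * (b - a)
            fc = func(c)
        else:                      # the maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + GOLD * (b - a)
            fd = func(d)
    return 0.5 * (a + b)

def direction_set_maximize(func, start, order, span=10.0, sweeps=20, tol=1e-8):
    """Repeated line maximization along unit vectors, visited in `order`."""
    x = list(start)
    best = func(x)
    for _ in range(sweeps):
        for axis in order:
            def along(value, axis=axis):
                trial = list(x)
                trial[axis] = value
                return func(trial)
            x[axis] = line_maximize(along, x[axis] - span, x[axis] + span)
        value = func(x)
        if value - best < tol:     # no further improvement: stop
            break
        best = value
    return x

# Toy objective with a known maximum, standing in for ECC(F,R).
target = [1.0, -2.0, 0.5, 0.0, 3.0, -1.5]

def toy_criterion(params):
    return -sum((p - q) ** 2 for p, q in zip(params, target))

found = direction_set_maximize(toy_criterion, [0.0] * 6, order=range(6))
```

The `order` argument is where a parameter search sequence would enter; the sequence `range(6)` used here is arbitrary and carries no claim about the order reported in the paper.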


E.  Evaluation  Method 


No true "gold standard" exists for the accuracy of
medical image registration, but a prospective method based
on fiducial markers can be taken as a "gold standard" to
perform an objective, blinded evaluation of the accuracy of
retrospective image-to-image registration techniques. Image
volumes of three modalities (CT, MR and PET) were acquired from
patients undergoing neurosurgery at Vanderbilt University
Medical Center. All traces of the fiducial markers were removed
from these volumes, which were then provided to project
collaborators outside Vanderbilt. The collaborators performed
retrospective registrations on the volumes, calculating
transformations from CT to MR and/or from PET to MR, and
communicated their transformations to Vanderbilt, where the
accuracy of each registration was evaluated. In the evaluation,
the accuracy was measured at multiple "regions of interest",
i.e. areas in the brain that would commonly be of neurological
interest. A region was defined in the MR image and its
centroid C was determined. The prospective registration was
used to obtain the corresponding point C' in CT or PET. The
retrospective registration was then applied to this point,
producing C" in MR. Statistics were gathered on the target
registration error, which was the disparity between the
original point C and its corresponding point C".
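
The error measure can be illustrated with a small Python sketch. The names are hypothetical, each registration is represented as a rotation matrix and a translation in millimetres, and taking the disparity to be the Euclidean distance is an assumption.

```python
import math

def apply_rigid(rotation, translation, point):
    """Apply x -> R x + t to a 3D point given in millimetres."""
    return [sum(rotation[i][k] * point[k] for k in range(3)) + translation[i]
            for i in range(3)]

def target_registration_error(c, prospective, retrospective):
    """Distance between C and C'', with C' = prospective(C) in CT or PET
    and C'' = retrospective(C') back in MR."""
    c_prime = apply_rigid(prospective[0], prospective[1], c)
    c_double_prime = apply_rigid(retrospective[0], retrospective[1], c_prime)
    return math.dist(c, c_double_prime)

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
prospective = (identity, [10.0, 0.0, 0.0])     # MR -> CT, marker based
retrospective = (identity, [-10.0, 3.0, 4.0])  # CT -> MR, image based
tre = target_registration_error([20.0, 30.0, 40.0], prospective, retrospective)
```

In this toy case the retrospective transform misses the exact inverse by (0, 3, 4) mm, so the error at the centroid is 5 mm. A perfect retrospective registration would return C" = C and an error of zero.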

This study was carried out in a blinded fashion, in the
sense that the investigators at sites outside Vanderbilt did not
know the standard results and the researchers at Vanderbilt did
not know the exact registration algorithm.

TABLE I

DESCRIPTION OF THE IMAGE VOLUMES

Modality   Image size (voxels)   Voxel size (mm)              Data range (approx.)

Low resolution:
CT         512^2 x [28~34]       0.65^2 x 4.0                 -1024 to 2000
MR         256^2 x [20~26]       [1.25~1.28]^2 x [4.0~4.16]   0 to 2000
PET        128^2 x 15            2.59^2 x 8.0                 -128 to 1200

High resolution:
CT         512^2 x [40~49]       [0.40~0.45]^2 x 3            -1024 to 2000
MR_PD      256^2 x [51~52]       [0.78~0.86]^2 x 3            0 to 4000
MR_T1      256^2 x 52            [0.78~0.86]^2 x 3            0 to 4000
MR_T2      256^2 x 52            [0.78~0.86]^2 x 3            0 to 4000

III.  Results 

Our registration results, as evaluated by Vanderbilt
University, are shown in Table II and Table III.

TABLE II

THE REGISTRATION ERROR OF CT-MR (unit: mm)

        CT-PD   CT-T1   CT-T2   CT-PDr  CT-T1r  CT-T2r
mean    2.28    1.73    2.20    0.78    0.78    1.04
SEM     0.08    0.06    0.94    0.07    0.06    0.09
SD      0.77    0.77    1.15    0.52    0.42    0.72

TABLE III

THE REGISTRATION ERROR OF PET-MR (unit: mm)

        PET-PD  PET-T1  PET-T2  PET-PDr PET-T1r PET-T2r
mean    4.68    3.54    3.27    2.36    2.42    3.02
SEM     0.40    0.27    0.19    0.19    0.18    0.29
SD      2.91    2.13    1.51    1.29    1.12    2.02

Since the voxel sizes of images F and R were different, a
virtual voxel size was defined for comparison. If the registration
error was smaller than the virtual voxel size, the registration
accuracy was sub-voxel. The virtual voxel size was defined as
the length of the voxel diagonal of MR when CT and MR were
matched, and as that of PET when PET and MR were matched.
That is,

$$\sqrt{1.25^2 + 1.25^2 + 4.0^2} \approx 4.373 \text{ (mm)}$$

and

$$\sqrt{2.59^2 + 2.59^2 + 8.0^2} \approx 8.80 \text{ (mm)}$$

for the respective cases. In Table II and Table III, the mean
error and SD were far smaller than the respective virtual
voxel size.
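
The sub-voxel check can be reproduced with a few lines of Python. The variable and function names are hypothetical; the voxel sizes are those of the low resolution MR and PET images, and the error values are the mean rows of Table II and Table III.

```python
import math

def virtual_voxel_size(voxel_mm):
    """Length of the voxel diagonal in millimetres."""
    return math.sqrt(sum(s * s for s in voxel_mm))

mr_diag = virtual_voxel_size((1.25, 1.25, 4.0))   # used for the CT-MR tasks
pet_diag = virtual_voxel_size((2.59, 2.59, 8.0))  # used for the PET-MR tasks

ct_mr_means = [2.28, 1.73, 2.20, 0.78, 0.78, 1.04]   # Table II, mean row
pet_mr_means = [4.68, 3.54, 3.27, 2.36, 2.42, 3.02]  # Table III, mean row

ct_mr_sub_voxel = all(m < mr_diag for m in ct_mr_means)
pet_mr_sub_voxel = all(m < pet_diag for m in pet_mr_means)

ct_mr_average = sum(ct_mr_means) / len(ct_mr_means)
pet_mr_average = sum(pet_mr_means) / len(pet_mr_means)
```

Both checks hold for every task, and the two averages come out near 1.47 mm and 3.22 mm, the figures quoted in the abstract.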

IV.  Discussion

A practice data set and its standard results were
provided by Vanderbilt University for error checking. It
was used for some experiments and for visual inspection of
the registration.

A.  Embedding  of  Mutual  Information 

We tested both the standard formulation of mutual information
and a normalized form. Our experiments showed that
both performed fairly well, but higher accuracy appeared to be
achieved with normalized mutual information, so the
normalized form was used for all the data.

B.  Sub-sampling  and  Multi-resolution 

When the whole image was chosen as the sampling range,
the accuracy was not as high as expected. This might be
due to the partial volume effect and the contribution of
background artifacts, which caused local optima of the
mutual information. It was better to choose the
range described in the methodology section. Sub-voxel
registration was already achieved at the first level of the
pyramid. Pluim et al. showed that multi-resolution
matching was not significantly more accurate than
direct registration, as was also observed in our experiment,
but it reduced the computation time. For that reason, the
multi-resolution strategy was chosen.

C.  Pre-processing 

The results showed that when the PET images were not
pre-segmented, the registration error was still sub-voxel but
larger than that achieved with pre-segmented PET images.

V.  Conclusion 

Our results demonstrated that sub-voxel multi-modal
registration accuracy was achieved using the
maximization of normalized mutual information, which makes
this method suited for clinical applications.